Geographically Organized Small Communities and the Hardness of Clustering Social Networks

Authors

  • Miklós Kurucz
  • András A. Benczúr
Abstract

Spectral clustering, while perhaps the most efficient heuristic for graph partitioning, has recently gained a bad reputation for failing on large-scale power-law graphs. In this chapter we identify the abundance of small communities connected by long tentacles as the major obstacle for spectral clustering. These subgraphs hide the higher-level structure and result in a highly degenerate adjacency matrix with several hundred eigenvalues very close to 1. Our results on clustering social networks, telephone call graphs, and Web graphs are twofold. (1) We show that graphs generated by existing social network models are not as difficult to cluster as real-world graphs. To this end we give a new combined model that yields degenerate adjacency matrices and hard-to-partition graphs. (2) We give heuristics for spectral clustering of large-scale real-world social networks that handle tentacles and small dense communities. Our algorithm outperforms all previous methods for power-law graph partitioning in both speed and cluster quality. Combining heuristics for contracting tentacles and for removing community cores, which involve the recent SCAN (Structural Clustering Algorithm for Networks) algorithm, we are able to efficiently find balanced partitions of power-law graphs with over 10 million edges. In particular, our heuristics promise similar or better performance than semidefinite relaxation with orders-of-magnitude lower running time.
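
The degeneracy described in the abstract can be illustrated on a toy graph. The sketch below is not the authors' model or algorithm. It assumes that "adjacency matrix" refers to the degree-normalized matrix D^{-1/2} A D^{-1/2}, whose eigenvalues lie in [-1, 1], and it uses a hypothetical construction (a dense core with small cliques hanging off long paths) whose sizes are chosen only for illustration. All function names and parameters are assumptions.

```python
import numpy as np

def normalized_adjacency_eigenvalues(n, edges):
    """Eigenvalues of D^{-1/2} A D^{-1/2} for an undirected graph, largest first."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    degree = A.sum(axis=1)              # every node here has degree >= 1
    inv_sqrt = 1.0 / np.sqrt(degree)
    M = A * inv_sqrt[:, None] * inv_sqrt[None, :]
    return np.sort(np.linalg.eigvalsh(M))[::-1]

def clique_edges(nodes):
    """All pairs among the given nodes."""
    return [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]

def build_toy_graph(core_size=30, n_tentacles=10, path_len=8, community_size=4):
    """Hypothetical toy graph: a clique core plus small cliques on long paths."""
    edges = clique_edges(list(range(core_size)))
    next_id = core_size
    for t in range(n_tentacles):
        prev = t % core_size            # attach each tentacle to a core node
        for _ in range(path_len):       # the long path (the "tentacle")
            edges.append((prev, next_id))
            prev = next_id
            next_id += 1
        # the last path node joins a small dense community
        community = [prev] + list(range(next_id, next_id + community_size - 1))
        next_id += community_size - 1
        edges += clique_edges(community)
    return next_id, edges

def count_near_one(eigenvalues, tol=0.05):
    """Number of eigenvalues within tol of 1."""
    return int(np.sum(eigenvalues > 1.0 - tol))

core_only = normalized_adjacency_eigenvalues(30, clique_edges(list(range(30))))
n, edges = build_toy_graph()
with_tentacles = normalized_adjacency_eigenvalues(n, edges)
print("near-1 eigenvalues, core only:     ", count_near_one(core_only))
print("near-1 eigenvalues, with tentacles:", count_near_one(with_tentacles))
```

In this toy setting each tentacle contributes an eigenvalue close to 1, so the leading eigenvectors mostly separate individual tentacles from the rest of the graph and carry little information about any larger-scale structure.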

Similar resources

Detecting Overlapping Communities in Social Networks using Deep Learning

In network analysis, a community is typically thought of as a group of nodes with a high density of edges among themselves and a low density of edges to other parts of the network. Detecting community structure is important in any network analysis task, especially for revealing patterns between specified nodes. A variety of approaches have been presented in the literature for overlapping...

An Optimized Firefly Algorithm based on Cellular Learning Automata for Community Detection in Social Networks

Community structure is one of the important features of social networks. A community is a subgraph whose nodes have many connections to nodes inside the community and very few connections to nodes outside it. The objective of community detection is to separate groups or communities that are linked more closely. In fact, community detection is the clustering...

The anatomy of urban social networks and its implications in the searchability problem

The appearance of large geolocated communication datasets has recently increased our understanding of how social networks relate to their physical space. However, many recurrently reported properties, such as the spatial clustering of network communities, have not yet been systematically tested at different scales. In this work we analyze the social network structure of over 25 million phone us...

Sampling from social networks’s graph based on topological properties and bee colony algorithm

In recent years, the sampling problem in massive social network graphs has attracted much attention, since a small, good sample can be analyzed quickly instead of a huge network. Many algorithms have been proposed for sampling social network graphs. The purpose of these algorithms is to create a sample that is approximately similar to the original network’s graph in terms of properties such as de...

Hybrid Community Detection in Social Networks

One of the great challenges in Social Network Analysis (SNA) is community detection. A community is a group of vertices with dense intra-connections and sparse inter-connections. Community detection, or clustering, reveals the community structure of social networks and hidden relationships among their constituents. Given the growth of datasets related to social networks, we need scalabl...

Publication date: 2010